Abstract



This paper considers several issues with the analysis and interpretation of interactions in unbalanced
factorial designs. The effect of design weights on the interaction parameters in factorial designs and an
approach for the analysis of interactions using finite intersection tests are discussed.

To motivate and illustrate the issues discussed in this article, we begin with a simple example. 

Consider the problem of a researcher designing a study to analyze the interaction of two treatment
factors (A and B), where factor A has three levels and factor B has four levels. Following
Kirk (1995, Chapter 9), this is a completely randomized factorial design, represented as CRF-ab where
a = 3 and b = 4. Kirk (1995, p. 422) and most other authors of texts in classical experimental design in the
social and behavioral sciences, biostatistics and statistics recommend that the researcher (1) assume a
balanced design with an equal number of observations per cell, (2) establish the magnitude of the
interactions ($\gamma_{ij}$) one wishes to detect with the desired power for the overall test of the interaction, with an
estimate of common within subject variance, and (3) determine the sampling plan for the study.
Alternatively, following Cohen (1988), an estimate of the effect size for the test of interaction is specified and
again used to determine the equal number of observations per cell.

Following the procedure discussed by Kirk (1995, p. 401), one researcher's power analysis showed
that 2.75 subjects per cell, or N = 33 subjects in total, were required for the study to maintain a 0.80 level of
power to reject the overall test of interaction (AB). Three observations per cell yielded a slightly
higher power, and the researcher knew that a design with an equal number of observations per cell was “easier”
to analyze; hence, the researcher decided on a sample size of N = 36 subjects.

A second researcher knew that factors A and B contained mutually exclusive and exhaustive treatment
levels. Furthermore, this researcher knew that the numbers of observations
associated with the levels of factor A in the population were equal and that the numbers associated with the
levels of factor B were proportional in the population with fixed ratios 2:3:2:4. Using this information, the
researcher decided on the sampling plan shown in Table 1, an orthogonal design with proportional cell
frequencies.
Table 1 Design Weights W_ij

                        Factor B
                 B1    B2    B3    B4    Total
Factor A   A1     2     3     2     4      11
           A2     2     3     2     4      11
           A3     2     3     2     4      11
        Total     6     9     6    12      33



Thus, we have two design strategies: an “equal” weight design and a “proportional weight” or “balanced” 
design. 
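The proportional plan in Table 1 can be rebuilt from the stated population ratios. The following is a minimal sketch, assuming the smallest integer plan that honors equal proportions for factor A and the ratios 2:3:2:4 for factor B; the variable names are illustrative and not taken from any package.

```python
from fractions import Fraction

# Population ratios stated in the text: equal across the three levels of
# factor A, and 2:3:2:4 across the four levels of factor B.
ratios_a = [1, 1, 1]
ratios_b = [2, 3, 2, 4]

# Smallest integer plan with these margins (assumption: one "unit" per ratio point).
weights = [[ra * rb for rb in ratios_b] for ra in ratios_a]

row_totals = [sum(row) for row in weights]        # W_{i+}
col_totals = [sum(col) for col in zip(*weights)]  # W_{+j}
grand_total = sum(row_totals)                     # W_{++}

# Proportionality check: W_ij equals W_{i+} * W_{+j} / W_{++} in every cell.
is_proportional = all(
    Fraction(weights[i][j]) == Fraction(row_totals[i] * col_totals[j], grand_total)
    for i in range(len(ratios_a))
    for j in range(len(ratios_b))
)

print(row_totals, col_totals, grand_total, is_proportional)
```

Exact fractions are used for the check so that the proportionality test does not depend on floating-point rounding.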

When the first researcher performed the experiment with N = 36 subjects, several data
values were lost and the data realized for the experiment were those shown in Table 2, from Overall and
Spiegel (1969): an unbalanced or nonorthogonal design. The second researcher began the study with
N = 33 subjects according to the sampling plan given in Table 1; however, one subject left the experiment
and some of the subjects were incorrectly assigned to the treatment combinations. Again, the data shown in
Table 2 were realized. Finding yourself in this situation, how would you analyze the data in Table 2?
Should you employ equal weights, balanced weights, unequal weights or the cell frequencies as weights
(“sample” weights)? Does the selection of weights make a difference for testing the interaction hypothesis or
estimating the interaction effects?

Table 2 Overall and Spiegel Data

                          Factor B
                B1            B2            B3            B4
Factor A  A1    61, 73, 52    79, 65, 81    43, 35        56, 25, 19, 35
          A2    42, 53        37, 32, 50    87, 81, 65    72, 84
          A3    96, 81, 92    45, 37        75, 59        98, 77, 91



Before answering this question, let us see what others say about the analysis of a two-way CRF-34
analysis of variance (ANOVA) with fixed effects and unbalanced data as shown in Table 2. There has
been considerable discussion of this situation in the social and behavioral sciences by many authors, for
example, Appelbaum and Cramer (1974), Carlson and Timm (1974), Cramer and Appelbaum (1980), Keren
and Lewis (1977), Lewis and Keren (1977), O’Brien (1976), Overall and Spiegel (1969, 1973), Overall,
Spiegel, and Cohen (1975), Rawlings (1972, 1973), Searle (1995), and Timm and Carlson (1975), among
others, yet there is not total agreement among the authors on the analysis. Most authors agree that selecting
sample weights, that is, weighting by cell frequencies, is incorrect unless the unbalanced numbers of observations
in the cells would result in proportional patterns over several replications of the experiment.
Following Scheffé (1959, p. 93), others might say that the test of interaction does not depend on the system
of weights; only the tests of main effects depend on the weights. This is correct; however, the interaction
parameters $\gamma_{ij}$ and their estimators, $\hat{\gamma}_{ij}$, depend on the weights used in the design. This led Arnold (1981,
pp. 92-100), Davidson and Toporek (1991) and Fujikoshi (1993) to assert that the researcher should first establish
the system of design weights or restrictions on the model. Following the establishment of design weights or
model restrictions, one next determines the sampling plan for the design based on power considerations.
Unfortunately, there is no “optimal” strategy for selecting the system of design weights (Fujikoshi,
1993).

Returning to our example, we would not weight by the sample cell frequencies or use weighted
averages of cell means since the $n_{ij}$ are not random. The first researcher would select an equal cell weight
analysis, which associates weights (1 / a) with each level of factor A and (1 / b) with each level of factor B,
provided the treatment levels are mutually exclusive and exhaustive for both factors as suggested by
Carlson and Timm (1974), among others. For the second researcher, proportional or balanced weights
should be used in the analysis since they reflect the proportion of the population in the j-th column. Does the
analysis make a difference? Yes. Before we look at four approaches and the difference in the analysis, we
introduce some notation.

The Two-Way ANOVA Model 

For our study we have a CRF-ab design; the full rank cell means linear model is:

$$y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b; \; k = 1, 2, \ldots, n_{ij}, \tag{1}$$

where the $\varepsilon_{ijk}$ are independent, normal, random errors with mean zero and common variance $\sigma^2$.
Furthermore, we assume that the cell means $\mu_{ij}$ have the following form:

$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}, \tag{2}$$

where $\mu$ is a general constant, $\alpha_i$ is the effect due to the i-th level of factor A, $\beta_j$ is the effect due to the
j-th level of factor B and $\gamma_{ij}$ represents the effect of the interaction of row i and column j. To associate
meaning to the parameters in (2), to make them uniquely estimable, one imposes a set of restrictions on the
parameters:

$$\sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0, \tag{3}$$

or

$$\sum_i U_i \alpha_i = \sum_j V_j \beta_j = \sum_i U_i \gamma_{ij} = \sum_j V_j \gamma_{ij} = 0, \tag{4}$$

where the weights $U_i$ and $V_j$ are nonnegative such that $\sum_i U_i > 0$ and $\sum_j V_j > 0$, or

$$\sum_i W_{i+} \alpha_i = \sum_j W_{+j} \beta_j = \sum_i W_{ij} \gamma_{ij} = \sum_j W_{ij} \gamma_{ij} = 0, \tag{5}$$

where the $W_{ij}$ are nonnegative cell weights, $W_{i+} = \sum_j W_{ij}$ and $W_{+j} = \sum_i W_{ij}$. These systems of weighting
schemes or restrictions are called $\Sigma$-restrictions, UV-restrictions and W-restrictions by Fujikoshi (1993).



Comparing (3) with (4) and (5), we see that selecting $U_i = V_j = 1$ or $W_{ij} = 1$ makes the equal weight
$\Sigma$-restrictions a special case of the UV- or W-restrictions, with $W_{ij} = U_i V_j$. If $W_{ij}$ is proportional to the
product of $U_i$ and $V_j$, we say the design is “balanced” with respect to the weighting scheme. In particular, if we let

$$\sum_j W_{ij} = W_{i+} \ \text{and} \ U_i = W_{i+} / W_{++}, \qquad \sum_i W_{ij} = W_{+j} \ \text{and} \ V_j = W_{+j} / W_{++}, \qquad \sum_i \sum_j W_{ij} = W_{++} \ \text{and} \ \sum_i U_i = \sum_j V_j = 1, \tag{6}$$

then $W_{ij} \propto U_i V_j$ and the scheme is balanced. We may also select $W_{ij} = n_{ij}$ so that $W_{++} = n_{++}$, and other
more abstract schemes. In general, UV-restrictions are not special cases of W-restrictions, unless all
$U_i$ and $V_j$ are positive (Fujikoshi, 1993). Furthermore, it is not possible to develop simple expressions for
all the model parameters $\mu$, $\alpha_i$, $\beta_j$ and $\gamma_{ij}$ for the general W-restrictions unless the weights are balanced,
since solutions for the $\alpha_i$'s and $\beta_j$'s involve solving the absorbing equations, Searle (1971, p. 297).

For the weighting scheme $W_{ij} = U_i V_j = (1/a)(1/b)$, which ensures positive, balanced $U_i$ and $V_j$, the
estimator of the interaction parameter is

$$\hat{\gamma}_{ij} = y_{ij.} - \sum_j y_{ij.}/b - \sum_i y_{ij.}/a + \sum_i \sum_j y_{ij.}/ab = y_{ij.} - y_{i..} - y_{.j.} + y_{...} \tag{7}$$

where $y_{ij.}$ is the cell mean. The “dot” notation is used to represent a simple or unweighted average. For
$U_i = n_{i+}$ and $V_j = n_{+j}$, or $W_{ij} = n_{i+} n_{+j}$, so that $W_{ij} > 0$ and balanced, the estimator of the interaction term is

$$\hat{\gamma}_{ij} = y_{ij.} - \sum_j n_{+j} y_{ij.}/n_{++} - \sum_i n_{i+} y_{ij.}/n_{++} + \sum_i \sum_j n_{i+} n_{+j} y_{ij.}/n_{++}^2 = y_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...} \tag{8}$$

where the “bar” notation represents a weighted average of cell means. In (7), we have unweighted averages
of cell means so that this is often termed the equal weight case. Situation (8) is called the unequal weighted
case, since the $\hat{\gamma}_{ij}$ depend on weighted averages of cell means. More generally, suppose we have arbitrary
weights where the weighting is specified by (5) with weights $\{W_{ij}\}$ where not all $W_{ij} > 0$ and the $W_{ij}$ are not
balanced. Then,

$$\hat{\gamma}_{ij} = y_{ij.} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j \tag{9}$$

where $\sum_i W_{i+} \hat{\alpha}_i = 0$, $\sum_j W_{+j} \hat{\beta}_j = 0$, and $\hat{\alpha}_i$ and $\hat{\beta}_j$ are solutions to the general absorbing equations,
Fujikoshi (1993). If $W_{ij} = n_{ij} = n_{i+} n_{+j} / n_{++}$, we have the case of proportional sampling. This is not the
case for our example since $W_{ij} = W_{i+} W_{+j} / W_{++}$ and $W_{ij} \neq n_{ij}$. We call this situation the “balanced” weight
case, since $W_{ij} \propto W_{i+} W_{+j}$. For $W_{ij} = n_{ij}$, $\hat{\alpha}_i$ and $\hat{\beta}_j$ do not have simple representations and (8) does not
apply.

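Using the population cell means $\mu_{ij}$, the corresponding row and column population marginal means are
defined:

$$\begin{array}{ll}
\text{Row} & \text{Column} \\
\mu_{i.} = \sum_j \mu_{ij} / b & \mu_{.j} = \sum_i \mu_{ij} / a \\
\bar{\mu}_{i.} = \sum_j n_{ij} \mu_{ij} / n_{i+} & \bar{\mu}_{.j} = \sum_i n_{ij} \mu_{ij} / n_{+j} \\
\tilde{\mu}_{i.} = \sum_j W_{ij} \mu_{ij} / W_{i+} & \tilde{\mu}_{.j} = \sum_i W_{ij} \mu_{ij} / W_{+j}
\end{array} \tag{10}$$

For the model given in (1) and all $n_{ij} > 0$, Graybill (1976, p. 560) has shown that the test for no interaction
or additivity may be represented in any of the following four equivalent forms:

$$\begin{array}{ll}
\text{(a)} & H_0: \gamma_{ij} = 0 \\
\text{(b)} & H_0: \mu_{ij} = \mu + \alpha_i + \beta_j \\
\text{(c)} & H_0: \mu_{ij} - \mu_{i'j} - \mu_{ij'} + \mu_{i'j'} = 0 \\
\text{(d)} & H_0: \gamma_{ij} - \gamma_{i'j} - \gamma_{ij'} + \gamma_{i'j'} = 0
\end{array} \tag{11}$$

for all subscripts $i$, $i'$, $j$ and $j'$.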

To test (11), one uses the F-statistic:

$$F = \frac{\sum_i \sum_j n_{ij} (y_{ij.} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j)^2 / (a-1)(b-1)}{\sum_i \sum_j \sum_{k=1}^{n_{ij}} (y_{ijk} - y_{ij.})^2 / (N - ab)} = \frac{MS_H}{MS_E}. \tag{12}$$

When $H_0$ as in (11) is true, the F-statistic in (12) has a central F-distribution with $\nu_h = (a-1)(b-1)$ and
$\nu_e = N - ab$ degrees of freedom. For a derivation of the statistic in (12), see for example, Scheffé (1959, p.
115), Arnold (1981, p. 96) or Fujikoshi (1993). The test is a uniformly most powerful invariant, unbiased
test, Arnold (1981, p. 109).

While the F-test of (11) using (12) is independent of the system of weights (Scheffé, 1959, p. 93;
Arnold, 1981, p. 95 and Fujikoshi, 1993, Theorem 3.2, p. 320), if the interaction hypothesis is rejected so
that not all the $\gamma_{ij}$'s are identically zero, the estimands $\gamma_{ij}$ depend on the design weights and hence so may
the interpretation of their estimates, $\hat{\gamma}_{ij}$.
Analyzing the Two-Way ANOVA

Returning to our example, we analyze the data in Table 2 using the BMDP4V computer package
(BMDP, 1992). This statistical package allows the researcher to input design weights using the /WEIGHTS
paragraph. The parameter EQUAL assigns cell weights $W_{ij} = (1/a)(1/b)$. The option SIZES assigns cell
weights $W_{ij} = n_{ij}$, a special case of the W-restrictions discussed by Fujikoshi (1993). With the LIST option
one may assign weights to each level of the design. Using the Table 1 weighting A = 11, 11, 11 and B = 6, 9, 6, 12,
we obtain the case labeled “balanced” in Table 3. Selecting the weights equal to the marginal cell frequencies,
A = 12, 10, 10 and B = 8, 8, 7, 9, we obtain the “weighted” average case, labeled “unequal” in Table 3. The
unequal weight case yields interactions that have the form given in (8). The parameter matrices for the cell
means and interaction parameters are the $a \times b$ matrices:

$$U = (\mu_{ij}) \quad \text{and} \quad \Gamma = (\gamma_{ij}) \tag{13}$$

where $\mu_{ij}$ is a cell mean and $\gamma_{ij}$ is an interaction, and i = 1, 2, 3 and j = 1, 2, 3, 4, for our example.



For the data in Table 2, the estimator of U is the matrix of sample cell means $y_{ij.}$; hence,

$$\hat{U} = (\hat{\mu}_{ij}) = \begin{pmatrix} 62.00 & 75.00 & 39.00 & 33.75 \\ 47.50 & 39.67 & 77.67 & 78.00 \\ 89.67 & 41.00 & 67.00 & 88.67 \end{pmatrix} \tag{14}$$

From the output of BMDP4V, one may construct the matrix of estimated interactions, $\hat{\Gamma}$. Using
estimates of the population marginal means given in (10) and an estimate of the overall mean $\hat{\mu}$, the
interactions $\hat{\gamma}_{ij}$ are obtained using (7) for the “equal” or unweighted case. Using (8), we obtain the
unequal weight case. BMDP4V solves the general absorbing equations to estimate $\gamma_{ij}$ when selecting
weights $W_{ij} = n_{ij}$, the “sample” weight case. For the “balanced” weight case,

$$\hat{\gamma}_{ij} = y_{ij.} - \tilde{y}_{i..} - \tilde{y}_{.j.} + \tilde{y}_{...} = y_{ij.} - \sum_j W_{ij} y_{ij.} / W_{i+} - \sum_i W_{ij} y_{ij.} / W_{+j} + \sum_i \sum_j W_{ij} y_{ij.} / W_{++} \tag{15}$$

where $W_{ij} \propto W_{i+} W_{+j}$. The twelve interaction estimates $\hat{\gamma}_{ij}$ are the cells in Table 3 for the four weightings:
equal, sample, balanced and unequal.

Table 3 3x4 Design, Interactions (γ̂_ij), Marginal Means, Overall Mean

Equal Weight Case (W_ij = 1)
                 B1          B2          B3          B4      Row mean
A1             4.7500     32.2500    -13.0833    -23.9167     52.4375
A2           -18.0208    -11.3542     17.3125     12.0625     60.7083
A3            13.2708    -20.8958     -4.2292     11.8542     71.5833
Column mean   66.3889     51.8889     61.2222     66.8056     61.5763

Sample Weight Case (W_ij = n_ij)
                 B1          B2          B3          B4
A1             3.8591     30.6703    -14.3769    -18.7086
A2           -19.5090    -13.5312     15.4217     16.6733
A3             9.1469    -25.7086     -8.7557     13.8293
Overall mean                                                   61.8125

Balanced Weight Case (W_ij = W_i+ W_+j)
                 B1          B2          B3          B4      Row mean
A1             6.1667     33.6667    -11.6667    -22.5000     51.0909
A2           -19.1818    -12.5152     16.1516     10.9015     61.9394
A3            13.0152    -21.1515     -4.4848     11.5985     71.9091
Column mean   66.3889     51.8889     61.2222     66.8056     61.6465

Unequal Weight Case (W_ij = n_i+ n_+j)
                 B1          B2          B3          B4      Row mean
A1             4.7705     30.5518    -11.9482    -22.1045     52.2734
A2           -18.1748    -13.2269     18.2731     13.7002     60.7189
A3            12.4502    -23.4352     -3.9352     12.8252     72.2604
Column mean   66.1146     53.3333     59.8333     64.7396     61.1585



As expected, we see from Table 3 that the interaction estimates, the row and column population
marginal mean estimates, and the estimate of the overall mean all depend on the design weights. However,
the F-test for testing the overall significance of the AB interactions as given in (11),

$$F = MS_H / MS_E = \frac{9560.00 / 6}{2269.92 / 20} = \frac{1593.33}{113.50} \approx 14.04, \tag{16}$$

is independent of the system of design weights. Setting $\alpha = 0.05$, the critical value for the F-test is 2.60,
with $\nu_h = 6$ and $\nu_e = 20$ degrees of freedom. Since the null hypothesis is rejected, we see that the
individual estimates of interaction and hence their interpretation are affected by the weights.
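The error term of the F-ratio in (16) can be checked directly from the raw scores. The sketch below assumes the cell layout of Table 2 and takes the interaction sum of squares, 9560.00, as quoted in (16) rather than recomputing it; the variable names are illustrative.

```python
# Raw scores per cell as laid out in Table 2 (rows A1-A3, columns B1-B4).
data = [
    [[61, 73, 52], [79, 65, 81], [43, 35], [56, 25, 19, 35]],
    [[42, 53], [37, 32, 50], [87, 81, 65], [72, 84]],
    [[96, 81, 92], [45, 37], [75, 59], [98, 77, 91]],
]
a, b = len(data), len(data[0])
n_total = sum(len(cell) for row in data for cell in row)

# Within-cell (error) sum of squares: squared deviations from each cell mean.
ss_error = 0.0
for row in data:
    for cell in row:
        mean = sum(cell) / len(cell)
        ss_error += sum((y - mean) ** 2 for y in cell)

df_h = (a - 1) * (b - 1)   # interaction degrees of freedom
df_e = n_total - a * b     # error degrees of freedom
ms_error = ss_error / df_e

ss_h = 9560.00             # interaction sum of squares quoted in (16), not recomputed here
f_stat = (ss_h / df_h) / ms_error
print(n_total, df_h, df_e, round(ss_error, 2), round(ms_error, 2), round(f_stat, 2))
```

Because the error sum of squares uses only within-cell deviations, it is the same under every weighting scheme, which is one way to see why only the numerator of (12) needs the fitted additive model.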

Investigating the interaction estimates in Table 3, except for the situation $W_{ij} = n_{ij} > 0$, each $\hat{\gamma}_{ij}$ is
estimating a population parameter that has the following general structure:

$$\gamma_{ij} = \mu_{ij} - \text{marginal row mean} - \text{marginal column mean} + \mu \tag{17}$$

where $\mu_{ij}$ is a population cell mean, $\mu$ is an overall mean, and the marginal means are defined in (10). This
follows from (7) and (8) with $\Sigma$-restrictions (3) and UV-restrictions (4), respectively. With W-restrictions
(5) and $W_{ij} \propto W_{i+} W_{+j}$, (9) reduces to (15) which has the form of (17). These observations lead to the
following general theorem.

Theorem 1. Using $\Sigma$-restrictions, UV-restrictions or W-restrictions with $W_{ij} \propto W_{i+} W_{+j}$, the
interaction parameters have the residual form given in (17) for appropriately defined row,
column, and overall means.
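As a rough numerical check on the residual form in (17), the sketch below recomputes the interaction estimates for three product-weight schemes from the cell totals and cell sizes of Table 2. The function name and argument layout are illustrative assumptions, not part of BMDP4V; the sample weight case is omitted because it requires solving the absorbing equations.

```python
# Cell totals and cell sizes read from Table 2 (Overall and Spiegel data).
cell_sums = [[186, 225, 78, 135],
             [95, 119, 233, 156],
             [269, 82, 134, 266]]
cell_n = [[3, 3, 2, 4],
          [2, 3, 3, 2],
          [3, 2, 2, 3]]
cell_means = [[s / n for s, n in zip(srow, nrow)]
              for srow, nrow in zip(cell_sums, cell_n)]


def residual_interactions(means, row_w, col_w):
    """Return (gamma, row_means, col_means, overall) for product weights.

    row_w weights the levels of A and col_w the levels of B; each set is
    normalized to sum to one before it is used.
    """
    u = [w / sum(row_w) for w in row_w]
    v = [w / sum(col_w) for w in col_w]
    a, b = len(means), len(means[0])
    row_means = [sum(v[j] * means[i][j] for j in range(b)) for i in range(a)]
    col_means = [sum(u[i] * means[i][j] for i in range(a)) for j in range(b)]
    overall = sum(u[i] * row_means[i] for i in range(a))
    gamma = [[means[i][j] - row_means[i] - col_means[j] + overall
              for j in range(b)] for i in range(a)]
    return gamma, row_means, col_means, overall


equal = residual_interactions(cell_means, [1, 1, 1], [1, 1, 1, 1])
balanced = residual_interactions(cell_means, [11, 11, 11], [6, 9, 6, 12])
unequal = residual_interactions(cell_means, [12, 10, 10], [8, 8, 7, 9])

for name, (gamma, _, _, overall) in [("equal", equal),
                                     ("balanced", balanced),
                                     ("unequal", unequal)]:
    print(name, round(gamma[0][0], 4), round(overall, 4))
```

Only the two weight vectors change between the three calls, which makes the dependence of each estimate on the design weights explicit.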



From (9), we see that Theorem 1 is not true for an arbitrary set of nonnegative weights $\{W_{ij}\}$. Even so,
W-restrictions with weights $W_{ij} = n_{ij}$ yield a nice solution to the ANOVA problem, Fujikoshi (1993).

Bradu and Gabriel (1974) called contrasts in $\Gamma$ that have the simple residual form given by (17) a
product-type contrast. Using the matrix $\Gamma$ or the matrix of means U, a product-type contrast is defined as
$\psi = a'\Gamma b = a'Ub$ where a and b are contrast vectors, that is, the elements of a and b sum to zero
($\sum_i a_i = \sum_j b_j = 0$). Even though the interactions do not have the residual form given in (17) for arbitrary
W-restrictions, we have the following general result.

Theorem 2. For $\gamma_{ij}$ defined as the residual $\gamma_{ij} = \mu_{ij} - \mu - \alpha_i - \beta_j$, $\psi = a'Ub = a'\Gamma b$ for all
contrast vectors a and b and its value does not depend on $\Sigma$-, UV- or W-restrictions.
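A short sketch of why the weights drop out may help. It covers only the cases in which the interactions have the residual form (17) with product weights; the symbols $u$, $v$, $1_a$ and $1_b$ are introduced here for illustration, with $u$ and $v$ the normalized row and column weights and $1_a$, $1_b$ vectors of ones.

$$\Gamma = (I_a - 1_a u')\, U\, (I_b - v 1_b') = U - 1_a u' U - U v 1_b' + 1_a (u' U v) 1_b'$$

$$a' \Gamma b = a' U b - (a' 1_a)(u' U b) - (a' U v)(1_b' b) + (a' 1_a)(u' U v)(1_b' b) = a' U b$$

Every term after the first contains either $a' 1_a = \sum_i a_i = 0$ or $1_b' b = \sum_j b_j = 0$, so the value of the product contrast is the same whichever weights $u$ and $v$ define the marginal means.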

More generally, Bradu and Gabriel (1974), Milliken and Johnson (1992, p. 116) and Boik (1993)
define interaction contrasts as

$$\psi = \sum_{i=1}^{a} \sum_{j=1}^{b} c_{ij} \mu_{ij}, \quad \text{or equivalently} \quad \psi = \mathrm{Trace}(C'_{AB} U) \tag{18}$$

where $C_{AB}$ is an $a \times b$ matrix with elements $\{c_{ij}\}$; the elements in each row and each column of $C_{AB}$ sum
to zero (are contrast vectors) and the function Trace(·) is defined as the sum of the diagonal elements of a
square matrix. Thus, a product-type contrast is a special case of an interaction contrast in which the
coefficient matrix $C_{AB} = ab'$, an outer product matrix of rank one. Again, we have the following general
result.

Theorem 3. For $\gamma_{ij}$ defined as the residual $\gamma_{ij} = \mu_{ij} - \mu - \alpha_i - \beta_j$, $\psi = \mathrm{Trace}(C'_{AB} U) =
\mathrm{Trace}(C'_{AB} \Gamma)$ for all contrast matrices $C_{AB}$ and its value does not depend on $\Sigma$-, UV- or
W-restrictions.

Thus, product-type contrasts in $\mu_{ij}$ or $\gamma_{ij}$ are a subset of all interaction contrasts.

When one rejects the overall test of interaction using an overall F-statistic, we know from Scheffé
(1953; 1959, p. 109) that either an individual interaction term $\gamma_{ij}$ is significantly different from zero or
some parametric function:

$$\theta = \sum_i \sum_j c_{ij} \gamma_{ij} = \mathrm{Trace}(C'\Gamma) \tag{19}$$

is nonzero for some matrix of coefficients $C_{a \times b} = \{c_{ij}\}$. Comparing (19) and (18), we see that the matrix
$C_{a \times b}$ is an arbitrary matrix, with no restrictions on the sum of the elements $c_{ij}$, while $C_{AB}$ is a contrast
matrix in which the elements in each row and each column sum to zero. There is no contradiction here
since by (18) and Theorem 3, all linear combinations $\theta$ are contained in $\psi$. This result is implicit in Boik
(1993). To see this, note that there exist weight matrices $W_a$ and $W_b$ such that $W_a U W_b = \Gamma$. Hence,
$C' W_a U W_b = C'\Gamma$ for an arbitrary matrix C. However, $\mathrm{Trace}(C'\Gamma) = \mathrm{Trace}(C' W_a U W_b) =
\mathrm{Trace}(W_b C' W_a U) = \mathrm{Trace}(C'_{AB} U) = \mathrm{Trace}(C'_{AB} \Gamma)$ since $W_b C' W_a$ is a contrast matrix. This
establishes the result. For the equal weight case, $W_a$ and $W_b$ have a simple form. In general, their
construction is more complicated, Fujikoshi (1993).

When performing a post-hoc analysis following a test of interaction, one first evaluates which, if any,
of the individual interaction parameters, a special case of product contrasts, are significantly different from
zero as recommended by Rosnow and Rosenthal (1989a, 1989b) and Boik (1979, 1993). However, one
may have to look beyond the individual $\gamma_{ij}$ to establish significance. From Table 3, the estimates of the
interaction terms are different, depending on the design weights chosen. Hence, their interpretation depends
on the weights used in the study.

To determine whether an individual interaction or linear combination $\psi = \mathrm{Trace}(C'_{AB} \Gamma) =
\mathrm{Trace}(C'_{AB} U)$ of the interactions is significantly different from zero following a significant overall F-test,
one may employ the S-method, Scheffé (1953). The simultaneous $(1 - \alpha)$ confidence intervals are given by

$$\hat{\psi} - S \hat{\sigma}_{\hat{\psi}} \le \psi \le \hat{\psi} + S \hat{\sigma}_{\hat{\psi}} \tag{20}$$

where $\hat{\psi}$ is an unbiased estimator of $\psi$, $\hat{\sigma}_{\hat{\psi}}$ is its estimated standard error and $S^2$ is the critical constant

$$S^2 = \nu_h F^{\alpha}_{(\nu_h, \nu_e)} \tag{21}$$

where $\nu_h = (a-1)(b-1)$ and $\nu_e = N - ab$, for our example. To evaluate (20) for our example, one merely
estimates $\psi$ and $\sigma_{\hat{\psi}}$ with $S = \sqrt{(2)(3)(2.60)} = 3.95$. Boik (1993) provides a general matrix formula for
obtaining the variance of an interaction contrast estimator $\hat{\psi}$.

For our example, the only solutions that are applicable are the “equal” weight and “balanced” weight
cases. For the equal weight case, using (7), the estimated variance of $\hat{\gamma}_{ij}$ is

$$\hat{\sigma}^2_{\hat{\gamma}_{ij}} = \hat{\sigma}^2 \left[ \left( \frac{(a-1)(b-1)}{ab} \right)^2 \frac{1}{n_{ij}} + \left( \frac{a-1}{ab} \right)^2 \sum_{j' \ne j} \frac{1}{n_{ij'}} + \left( \frac{b-1}{ab} \right)^2 \sum_{i' \ne i} \frac{1}{n_{i'j}} + \frac{1}{(ab)^2} \sum_{i' \ne i} \sum_{j' \ne j} \frac{1}{n_{i'j'}} \right] \tag{22}$$

To obtain the estimated variance for the balanced weight case, equation (15) is used. That is,

$$\hat{\gamma}_{ij} = y_{ij.} - \tilde{y}_{i..} - \tilde{y}_{.j.} + \tilde{y}_{...} = y_{ij.} - \sum_j W_{ij} y_{ij.} / W_{i+} - \sum_i W_{ij} y_{ij.} / W_{+j} + \sum_i \sum_j W_{ij} y_{ij.} / W_{++}. \tag{23}$$

The estimated standard errors for each $\hat{\gamma}_{ij}$ are given in Table 4. The values of $W_{ij}$ are obtained from
Table 1, the sample sizes are shown in Table 2, and from the denominator of the F-statistic,
$MS_E = \hat{\sigma}^2 = 113.50$.

Table 4 Estimated Standard Errors of the Interaction Estimates

Equal Weight Case (W_ij = 1)
          B1       B2       B3       B4
A1      4.556    4.434    4.996    4.319
A2      4.950    4.705    4.733    4.896
A3      4.620    4.950    5.055    4.563

Balanced Weight Case (W_ij = W_i+ W_+j)
          B1       B2       B3       B4
A1      4.956    4.426    5.469    3.671
A2      5.461    4.566    5.211    4.160
A3      5.045    4.832    5.550    3.902



Comparing the absolute value of $\hat{\gamma}_{ij}$, $|\hat{\gamma}_{ij}|$, with the critical constant $S \hat{\sigma}_{\hat{\gamma}_{ij}}$ for each interaction $\gamma_{ij}$ for
both the “equal” weight and the “balanced” weight cases, $\gamma_{12}$ and $\gamma_{14}$ are significantly different from zero
for both weighting schemes. However, $\gamma_{32}$ is significant for the equal weight scheme, but not the balanced
weight case. The intervals are:

$$\begin{array}{ll}
\text{Equal} & -40.449 < \gamma_{32} < -1.343 \\
\text{Balanced} & -30.551 < \gamma_{32} < 5.521.
\end{array} \tag{24}$$

Because the definition of the interaction parameters $\gamma_{ij}$ depends on the weighting scheme, it is not surprising
to find that the confidence sets for the $\gamma_{ij}$ differ in size. Thus, the significance or nonsignificance of a $\gamma_{ij}$
may be reported differently by two researchers for the same set of data. The only difference is in the design
weights selected for the analysis. This fact is often overlooked by applied researchers when discussing the
analysis of a CRF-ab design.
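The arithmetic behind an S-method interval such as the equal weight line of (24) is short enough to sketch. The example below assumes the tabled critical value 2.60 quoted in the text and takes the equal weight estimate and standard error for $\gamma_{32}$ from Tables 3 and 4; the helper name `scheffe_interval` is hypothetical.

```python
import math

df_h = 6        # (a - 1)(b - 1) for the 3 x 4 design
f_crit = 2.60   # F critical value at alpha = 0.05 with 6 and 20 df, as quoted in the text
s_const = math.sqrt(df_h * f_crit)


def scheffe_interval(estimate, std_error, constant):
    """Return the (lower, upper) limits estimate -/+ constant * std_error."""
    half_width = constant * std_error
    return estimate - half_width, estimate + half_width


# Equal weight case for gamma_32: estimate from Table 3, standard error from Table 4.
lower, upper = scheffe_interval(-20.8958, 4.950, s_const)
print(round(s_const, 2), round(lower, 2), round(upper, 2))
```

The same helper applies to any contrast once its estimate and standard error are available; only the critical constant changes when a different simultaneous procedure is used.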

In our analysis of the interactions, we chose to investigate the individual $\gamma_{ij}$ to locate significance
following the rejection of the overall F-test. Often these individual $\gamma_{ij}$ are not significant and one must
locate the contrast in the $\gamma_{ij}$'s, or equivalently the $\mu_{ij}$'s, that led to the rejection of the overall test. To find
the most significant contrast in $\gamma_{ij}$ ($\mu_{ij}$) that led to rejection of the overall test, the contrast matrix $C'_{AB}$
must be selected proportional to $\hat{\Gamma}'$ for a design with equal cell frequencies $n_{ij} = n$, Hochberg and
Tamhane (1987, p. 296). This is the case since $MS_H = SS_H / \nu_h$ and $SS_H$ in (12) for equal $n = n_{ij}$ has the
form $SS_H = n\,\mathrm{Trace}(\hat{\Gamma}'\hat{\Gamma})$. Extending this result to the unbalanced design, we let $W_{ij} = n_{ij}$. Then the most
significant (maximum) contrast is

$$\psi = \sum_{ij} (n_{ij} \hat{\gamma}_{ij}) \gamma_{ij} = \mathrm{Trace}[(N * \hat{\Gamma})' \Gamma] = \mathrm{Trace}(C'_{AB} \Gamma) = \mathrm{Trace}(C'_{AB} U) \tag{25}$$

where the matrix $N = \{n_{ij}\}$ is the matrix of cell frequencies, and $N * \hat{\Gamma} = \{n_{ij} \hat{\gamma}_{ij}\}$ is a Hadamard product. For
$W_{ij} = n_{ij}$, one substitutes $\hat{\gamma}_{ij}$ into (25) to obtain the coefficient matrix $C'_{AB}$ for $\psi$. The value of
$\hat{\psi} = 9560.00 = SS_H$, as expected. Using (20), the confidence interval for $\psi$ in (25) is

$$\begin{array}{c}
\hat{\psi} - S \hat{\sigma}_{\hat{\psi}} \le \psi \le \hat{\psi} + S \hat{\sigma}_{\hat{\psi}} \\
9560 - (3.95)(1041.64) \le \psi \le 9560 + (3.95)(1041.64) \\
5445.85 \le \psi \le 13674.18
\end{array} \tag{26}$$

which is significant, does not depend on the design weights, but is impossible to interpret.
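The identity $\hat{\psi} = SS_H$ can be checked from the published numbers. The sketch below assumes the sample weight estimates of Table 3 and the cell sizes of Table 2, entered by hand, and also confirms that those estimates satisfy the W-restrictions in (5) with $W_{ij} = n_{ij}$ up to rounding.

```python
# Sample weight interaction estimates (Table 3) and cell sizes (Table 2).
gamma_sample = [[3.8591, 30.6703, -14.3769, -18.7086],
                [-19.5090, -13.5312, 15.4217, 16.6733],
                [9.1469, -25.7086, -8.7557, 13.8293]]
cell_n = [[3, 3, 2, 4],
          [2, 3, 3, 2],
          [3, 2, 2, 3]]

# Maximum contrast evaluated at the estimates: sum of n_ij * gamma_ij^2.
ss_h = sum(n * g * g
           for n_row, g_row in zip(cell_n, gamma_sample)
           for n, g in zip(n_row, g_row))

# W-restrictions with W_ij = n_ij: weighted row and column sums should vanish.
row_checks = [sum(n * g for n, g in zip(n_row, g_row))
              for n_row, g_row in zip(cell_n, gamma_sample)]
col_checks = [sum(cell_n[i][j] * gamma_sample[i][j] for i in range(3))
              for j in range(4)]

print(round(ss_h, 1), [round(x, 3) for x in row_checks], [round(x, 3) for x in col_checks])
```

The small nonzero residuals in the restriction checks come from the four-decimal rounding of the tabled estimates.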



Another class of contrasts that are often studied following the F-test are called tetrad contrasts. Tetrad
contrasts involve four cells in the design, may be generated from the matrix $\Psi = C'_A U C_B$, where
$C_A$ and $C_B$ are simple contrast matrices, and do not depend on the design weights. To illustrate, we let

$$C'_A = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix} \quad \text{and} \quad C_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & -1 & -1 \end{pmatrix} \tag{27}$$

for our example. Then $C_{AB(ij)} = c_{A(i)} c'_{B(j)}$, where $c_{A(i)}$ is the i-th column of $C_A$ and $c_{B(j)}$ is the j-th column of
$C_B$, so that $\psi_{ij} = c'_{A(i)} U c_{B(j)} = \mathrm{Trace}(C'_{AB(ij)} U) = \mathrm{Trace}(C'_{AB(ij)} \Gamma)$ is a tetrad contrast. Thus, tetrad contrasts are
a subset of all product contrasts.

For the matrices $C_A$ and $C_B$ defined in (27), the six tetrad contrasts are shown in Table 5. In Table 5,
observe that only one of the tetrad contrasts is significantly different from zero using the S-method, for our
example.
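The six tetrads generated by (27) can be reproduced with a small product-contrast routine. This is a sketch under stated assumptions: the cell totals and sizes come from Table 2, $MS_E$ is taken as 113.496, the variance of a contrast in independent cell means is $\hat{\sigma}^2 \sum c_{ij}^2 / n_{ij}$, and the function name `product_contrast` is illustrative.

```python
import math

cell_sums = [[186, 225, 78, 135],
             [95, 119, 233, 156],
             [269, 82, 134, 266]]
cell_n = [[3, 3, 2, 4],
          [2, 3, 3, 2],
          [3, 2, 2, 3]]
ms_error = 113.496                # error mean square from the F-test
s_const = math.sqrt(6 * 2.60)     # S-method critical constant


def product_contrast(a_vec, b_vec):
    """Estimate and standard error of psi = a'Ub from the cell means."""
    estimate = 0.0
    variance = 0.0
    for i, a_i in enumerate(a_vec):
        for j, b_j in enumerate(b_vec):
            coef = a_i * b_j
            estimate += coef * cell_sums[i][j] / cell_n[i][j]
            variance += ms_error * coef ** 2 / cell_n[i][j]
    return estimate, math.sqrt(variance)


# Columns of C_A and C_B in (27).
c_a = [(1, 0, -1), (0, 1, -1)]
c_b = [(1, 0, 0, -1), (0, 1, 0, -1), (0, 0, 1, -1)]

tetrads = [product_contrast(a_vec, b_vec) for a_vec in c_a for b_vec in c_b]
significant = [abs(est) > s_const * se for est, se in tetrads]

for (est, se), sig in zip(tetrads, significant):
    print(round(est, 3), round(se, 3), "*" if sig else "")
```

Because the routine works on the cell means directly, no weighting scheme enters the calculation, in line with the weight invariance of tetrad contrasts.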

Table 5 Select Tetrad Contrasts Using the S-Method

Contrast (ψ)                Estimate   Std. Error   (Sig)   Low Limit   Upper Limit
γ11 - γ31 - γ14 + γ34         27.250       11.911             -19.794        74.294
γ12 - γ32 - γ14 + γ34         88.917       12.680     *        38.834       138.999
γ13 - γ33 - γ14 + γ34         26.917       13.405             -26.030        79.863
γ21 - γ31 - γ24 + γ34        -31.500       13.754             -85.822        22.822
γ22 - γ32 - γ24 + γ34          9.333       13.754             -44.989        63.655
γ23 - γ33 - γ24 + γ34         21.333       13.754             -32.989        75.655



Having selected the design weights for a two-way CRF-ab design, the S-method is the most appropriate
simultaneous test procedure for evaluating the significance of an arbitrary number of linear combinations of
the $\gamma_{ij}$. If one a priori restricts the investigation to only product-type contrasts, the maximal F-test should
be calculated to perform the overall test of significance since the procedure is more powerful than the
F-test, Boik (1993). The maximal F-test uses the Studentized Maximum Root (SMR) distribution; the critical
value for a two-way design is $R^2_{\alpha; p, q, \nu_e}$ where $p = \min(a-1, b-1)$, $q = \max(a-1, b-1)$ and $\nu_e = N - ab$.
When p = 1, $R^2_{\alpha; 1, q, \nu_e} = q F^{\alpha}_{(q, \nu_e)}$ reduces to the F-distribution.

For our example, p = 2 and q = 3. For $\alpha = 0.05$, the critical value of the SMR distribution is
$R^2_{.05; 2, 3, 20} = 13.221$ so that R = 3.64 < S = 3.95. Hence, using (20) and substituting R for S, we see that the
confidence set for product contrasts will always be shorter, and hence more resolute. Boik (1993)
recommends the procedure be used when one is only interested in product-type contrasts, which include for
example the individual $\gamma_{ij}$ and tetrad contrasts. Again, the $\gamma_{ij}$ depend on the weighting scheme.



Using a full rank cell means model, Boik (1993) developed SAS (SAS Institute, 1990) and SPSS
(1990) programs to calculate the maximal F-statistic. Using the program for our example, the maximal
F-statistic is:

$$\text{Maximal } F = \frac{(a'\hat{\Gamma} b)^2}{\hat{\sigma}^2_{\hat{\psi}}} = \frac{(a'\hat{U} b)^2}{\hat{\sigma}^2_{\hat{\psi}}} = \frac{(51.805)^2}{(6.257)^2} = 68.55 \tag{28}$$

where the maximal contrast vectors are:

$$a = \begin{pmatrix} 0.8156 \\ -0.4406 \\ -0.3750 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 0.1273 \\ 0.7333 \\ -0.2358 \\ -0.6248 \end{pmatrix}. \tag{29}$$



Comparing the maximal F-ratio to the SMR critical value $R^2_{.05; 2, 3, 20} = 13.221$, the interaction hypothesis is
rejected. The maximal product contrast, similar to the maximum generalized contrast given in (26), is also
difficult to interpret. However, using it as a guide, a product contrast that may be more meaningful is
$\psi = a'Ub$ where $a' = (1, -1/2, -1/2)$ and $b' = (0, 1, 0, -1)$, which compares $A_1$ with the average of
$A_2$ and $A_3$ for the levels of factor B at $B_2$ and $B_4$. For this contrast, $\hat{\psi} = 84.25$, $\hat{\sigma}_{\hat{\psi}} = 10.653$ and
$\hat{\psi} / \hat{\sigma}_{\hat{\psi}} = 7.909 > R = 3.64$ so that the comparison is significant. While one may continue to “data-snoop”
among the product contrasts, including the individual $\gamma_{ij}$, to locate significant interactions that may be
meaningful, the simplest and most easily interpreted contrasts are again the tetrad contrasts. Using the
maximal F-test, the intervals shown in Table 5 would be more resolute. However, can we do better? The
answer is yes. If a researcher is only interested in the (2 × 2) tetrad contrasts, the finite intersection test
(FIT) procedure developed by Krishnaiah (1964, 1965) yields the shortest confidence sets. Furthermore, all
tetrad contrasts are easily interpreted and they are weight invariant. The FIT procedure uses the
multivariate F-distribution and requires the use of a computer program to approximate its critical values,
Cox et al. (1994) and Timm (1995).

To evaluate the significance or nonsignificance of the $\binom{3}{2}\binom{4}{2} = 18$ tetrad contrasts for our example, the
statistics $T_i^2 = \hat{\psi}_i^2 / \hat{\sigma}^2_{\hat{\psi}_i}$ are calculated and each is compared to the critical value of the multivariate
F-distribution with 1 and 20 degrees of freedom. For $\alpha = 0.05$, Šidák’s upper product bound (see e.g.
Krishnaiah, 1979 or Cox et al., 1980 for a discussion of calculating Šidák’s bound) for the multivariate
F-distribution is 11.266. Thus, to establish $1 - \alpha$ simultaneous confidence intervals the critical constant

$$FIT = \sqrt{11.266} = 3.36 < R = 3.64 < S = 3.95 \tag{30}$$

is always less than the corresponding critical constants for the SMR distribution or the F-distribution. The
FIT confidence intervals are uniformly shorter if one is only interested in tetrad contrasts.
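The ordering of the three critical constants follows from taking square roots of the quoted critical values. The sketch below uses only numbers stated in the text (11.266, 13.221, and 2.60 with 6 numerator degrees of freedom) and does not compute any distribution quantiles itself.

```python
import math

# Squared critical constants quoted in the text for alpha = 0.05.
squared_constants = {
    "FIT": 11.266,      # Sidak bound on the multivariate F with 1 and 20 df
    "SMR": 13.221,      # Studentized Maximum Root with p = 2, q = 3, 20 df
    "S": 6 * 2.60,      # Scheffe: nu_h times the F critical value
}
constants = {name: math.sqrt(value) for name, value in squared_constants.items()}

# Interval length relative to the S-method for a contrast with the same standard error.
relative_length = {name: value / constants["S"] for name, value in constants.items()}

for name in ("FIT", "SMR", "S"):
    print(name, round(constants[name], 2), round(relative_length[name], 3))
```

Since every interval has half-width equal to the constant times the same standard error, the ratios give the relative interval lengths directly.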

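To use the FIT program, the cell means for our example are arranged into a vector:

$$\mu' = (\mu_{11}, \mu_{12}, \mu_{13}, \mu_{14}, \mu_{21}, \mu_{22}, \mu_{23}, \mu_{24}, \mu_{31}, \mu_{32}, \mu_{33}, \mu_{34}) \tag{31}$$

and a contrast matrix for the 18 tetrad contrasts is input to the FIT program:

$$C' = \left( \begin{array}{rrrrrrrrrrrr}
1 & -1 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & -1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & -1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & -1 & 1 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 \\
1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\
0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1
\end{array} \right)$$

Letting $\Psi = C'\mu$, the estimate of $\psi_i$ is $\hat{\psi}_i = c'_i \hat{\mu}$ where $\hat{\mu}$ is the vector of cell means, and $\psi_i$ and $c'_i$ are the i-th
rows of $\Psi$ and $C'$, respectively. The output for the FIT program is provided in Table 6.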



Table 6 FIT Output for (3x4) Design

TWO SIDED FINITE INTERSECTION TEST
(* INDICATES TO REJECT THE SUBHYPOTHESIS OF NO DIFFERENCE)

VARIABLE 1: SCORE
DEGREES OF FREEDOM 1, 20
S-SQUARE/NDF: 113.4960
SIDAK’S UPPER BOUND ON MULTIVARIATE F
CRITICAL VALUE: 11.266
LEVEL OF SIGNIFICANCE FOR VARIABLE 1: 0.050

LINEAR COMBINATIONS OF THE ORIGINAL MEANS

LINEAR
COMBINATION   STATISTIC   ACC/REJ    ESTIMATE    CONFIDENCE INTERVAL
      1          2.549               -20.8333    [  -64.6277,   22.9610 ]
      2         14.943       *        53.1667    [    7.0033,   99.3300 ]
      3         19.207       *        58.7500    [   13.7556,  103.7444 ]
      4         32.166       *        74.0000    [   30.2056,  117.7944 ]
      5         39.391       *        79.5833    [   37.0229,  122.1438 ]
      6          0.173                 5.5833    [  -39.4111,   50.5778 ]
      7          8.815               -40.8333    [  -86.9967,    5.3300 ]
      8         14.757       *       -52.8333    [  -98.9966,   -6.6700 ]
      9          5.246               -31.5000    [  -77.6633,   14.6633 ]
     10          0.761               -12.0000    [  -58.1633,   34.1633 ]
     11          0.461                 9.3333    [  -36.8300,   55.4966 ]
     12          2.406                21.3333    [  -24.8300,   67.4966 ]
     13         22.337       *       -61.6667    [ -105.4610,  -17.8723 ]
     14          0.001                 0.3333    [  -45.8300,   46.4967 ]
     15          5.234                27.2500    [  -12.7286,   67.2286 ]
     16         18.474       *        62.0000    [   13.5835,  110.4165 ]
     17         49.172       *        88.9167    [   46.3562,  131.4771 ]
     18          4.032                26.9167    [  -18.0778,   71.9111 ]



Table 6 clearly shows the significant tetrad contrasts for the Overall and Spiegel data. Comparing entries 15, 17, 
18, 9, 11, and 12 in this table with those in Table 5, observe that the tetrad confidence intervals for the FIT 
procedure are shorter in all cases. Given the relationship among the critical constants in (30), the FIT 
procedure also produces confidence intervals that are shorter than those realized using the maximal 
F-criterion. Hence, if one is only interested in tetrad contrasts, the FIT procedure should be utilized in the 
analysis of interactions for an $a \times b$ design. Of course, the FIT procedure is not limited to tetrad contrasts. 
It may be used with any finite number of contrasts. 
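The printed intervals in Table 6 appear consistent with the usual form, estimate plus or minus the square root of the critical value times the standard error, where the statistic equals the squared estimate divided by its squared standard error. This relationship is inferred from the printed numbers and is not a description of the FIT program's internals. Under that assumption, the Python sketch below rebuilds an interval from a printed estimate and statistic; the function names are hypothetical.

```python
import math

F_CRIT = 11.266  # critical value printed in the FIT output


def interval_from_output(estimate, statistic, f_crit=F_CRIT):
    """Rebuild a simultaneous interval from a printed estimate and statistic.

    Assumes statistic = estimate**2 / se**2, so that the half-width
    sqrt(f_crit) * se equals |estimate| * sqrt(f_crit / statistic).
    """
    half_width = abs(estimate) * math.sqrt(f_crit / statistic)
    return estimate - half_width, estimate + half_width


def rejects(statistic, f_crit=F_CRIT):
    """A contrast is declared significant when its statistic exceeds f_crit."""
    return statistic > f_crit


# Tetrads 17 and 2, using the estimates and statistics printed in Table 6.
lo17, hi17 = interval_from_output(88.9167, 49.172)
lo2, hi2 = interval_from_output(53.1667, 14.943)
print(f"tetrad 17: [{lo17:.4f}, {hi17:.4f}]  reject={rejects(49.172)}")
print(f"tetrad  2: [{lo2:.4f}, {hi2:.4f}]  reject={rejects(14.943)}")
```

Because the printed statistics are rounded to three decimals, the rebuilt limits agree with the table only to about two decimal places. A contrast is rejected exactly when its interval excludes zero.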

To perform a step-down FIT procedure for this example, one would remove tetrad 17, 
$\psi = \mu_{12} - \mu_{14} - \mu_{32} + \mu_{34}$, and recalculate the overall multivariate F-critical value, continuing to remove 
the tetrad corresponding to the largest statistic at each step and stopping the process when nonsignificance 
is realized. For the example, the step-down procedure found one more tetrad to be significant, tetrad 
number 15. The sequence of multivariate F-critical values was: 

{11.266, 10.946, 10.771, 10.586, 10.388, 10.176, 9.946, 9.697, 9.424} 

for the step-down FIT procedure. The step-down process will always yield at least the number of 
significant contrasts found using the single-step procedure, and possibly more. However, it is difficult to 
establish $1 - \alpha$ simultaneous confidence bounds for the population parameters at each step (Timm, 1995). 
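The step-down logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FIT program itself: the critical values are supplied as a precomputed list (one per step), and the statistics and critical values in the usage example are hypothetical placeholders, not the values from this example.

```python
def step_down(statistics, critical_values):
    """Return the labels rejected by a step-down procedure, in order.

    statistics: dict mapping a contrast label to its test statistic.
    critical_values: list whose k-th entry is the critical value to use
        after k contrasts have already been removed (assumed precomputed).
    """
    remaining = dict(statistics)
    rejected = []
    for crit in critical_values:
        if not remaining:
            break
        # Test the contrast with the largest remaining statistic.
        label = max(remaining, key=remaining.get)
        if remaining[label] <= crit:
            break  # nonsignificance reached: stop the process
        rejected.append(label)
        del remaining[label]
    return rejected


# Hypothetical statistics and a hypothetical decreasing critical-value sequence.
toy_stats = {"t1": 12.0, "t2": 9.5, "t3": 3.0}
toy_crits = [10.0, 9.0, 8.0]

single_step = [k for k, f in toy_stats.items() if f > toy_crits[0]]
stepwise = step_down(toy_stats, toy_crits)
print("single-step:", single_step)
print("step-down:  ", stepwise)
```

Because the critical values decrease as contrasts are removed, every contrast rejected by the single-step rule is also rejected here, and a contrast such as the hypothetical "t2" can be picked up at a later step.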

Conclusion 

This paper began with a researcher interested in analyzing whether there is a significant interaction 
between two factors in a completely randomized factorial design. In implementing the study, an unbalanced 
nonorthogonal design resulted. To analyze the data, one researcher used equal cell weights and the other 
used proportional cell or balanced weights. In performing the analysis, we found that the interaction 
parameters and their corresponding estimates $\hat{\gamma}_{ij}$ depend on the weighting scheme selected. Hence, 
when constructing confidence intervals for the $\gamma_{ij}$ or parametric functions of the $\gamma_{ij}$, one must discuss the 
weights used in the analysis. 

While the overall F-test and the corresponding S-method are used to investigate all $\gamma_{ij}$ and contrasts in the $\gamma_{ij}$, a 
more powerful test exists if one restricts the investigation of interactions to the $\gamma_{ij}$ and only product contrasts 
in the $\gamma_{ij}$. The most powerful overall test is the maximal F-test. When constructing confidence sets for the 
$\gamma_{ij}$, we saw that they still depend on the weighting scheme for the analysis. If a researcher studying 
interactions is only interested in the tetrad contrasts of the $\gamma_{ij}$, no overall test is performed. Instead, one 
uses the finite intersection test (FIT) procedure for the analysis. The overall hypothesis of no interaction is 
rejected if any tetrad is significantly different from zero. This approach is particularly attractive in the 
analysis of interactions for unbalanced designs since the procedure does not depend on the weighting 
scheme selected. 

In performing an analysis of interactions in unbalanced factorial designs, researchers have a 
responsibility to report the design weights when analyzing the interactions $\gamma_{ij}$. If one is only interested in 
tetrad contrasts, the FIT procedure yields confidence sets that have the smallest probability of covering zero 
and hence are more likely to yield significant results. 







References 



Appelbaum, M.I., & Cramer, E.M. (1974). Some problems in the nonorthogonal analysis of variance. 
Psychological Bulletin, 81, 335-343. 

Arnold, S.F. (1981). The theory of linear models and multivariate analysis. New York: John Wiley. 

Boik, R.J. (1979). Interactions, partial interactions, and interaction contrasts in the analysis of 
variance. Psychological Bulletin, 86, 1084-1089. 

Boik, R.J. (1993). The analysis of two-factor interactions in fixed effects linear models. Journal of 
Educational Statistics, 18, 1-40. 

Bradu, D., & Gabriel, K.R. (1974). Simultaneous statistical inference on interactions in two-way 
analysis of variance. Journal of the American Statistical Association, 69, 428-436. 

BMDP (1992). BMDP statistical software manual, Vol. 1 & 2 (Release 7.0 ed.). W.J. Dixon, editor. 
Los Angeles, CA: University of California Press. 

Carlson, J.E., & Timm, N.H. (1974). Analysis of nonorthogonal fixed-effects designs. Psychological 
Bulletin, 81, 563-570. 

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum. 

Cox, C.M., Krishnaiah, P.R., Lee, J.C., Reising, J., & Schuurman (1980). A study of finite 
intersection tests for multiple comparisons of means. Multivariate analysis-V (P.R. Krishnaiah, Ed.), 
435-466. New York: Academic Press. 

Cox, C.M., Fang, C., Boudreau, R.M., & Timm, N.H. (1994). Computer program for Krishnaiah's 
finite intersection tests for multiple comparisons of mean vectors. Interim Report No. 94-04, Research 
Methodology Program, University of Pittsburgh, Pittsburgh, PA. 

Cramer, E.M., & Appelbaum, M.I. (1980). Nonorthogonal analysis of variance - Once again. 
Psychological Bulletin, 87, 51-57. 

Davidson, M., & Toporek, J. (1991). BMDP 4V - General univariate and multivariate analysis of 
variance. BMDP Technical Report No. 67. Los Angeles, CA: BMDP Statistical Software, Inc. 

Fujikoshi, Y. (1993). Two-way ANOVA models with unbalanced data. Discrete Mathematics, 116, 
315-334. 

Graybill, F.A. (1976). Theory and application of the linear model. Boston: Duxbury Press. 

Hochberg, Y., & Tamhane, A.C. (1987). Multiple comparison procedures. New York: Wiley. 

Keren, G., & Lewis, C. (1977). A comment on coding in nonorthogonal designs. Psychological 
Bulletin, 84, 346-348. 

Kirk, R.E. (1995). Experimental design: Procedures for the behavioral sciences (3rd ed.). Pacific 
Grove, CA: Brooks/Cole. 

Krishnaiah, P.R. (1964). Multiple comparison tests in multivariate case. ARL 64-124. 
Wright-Patterson Air Force Base, Ohio. 







Krishnaiah, P.R. (1965). On the simultaneous ANOVA tests. Annals of the Institute of Statistical 
Mathematics, 17, 167-173. 

Krishnaiah, P.R. (1979). Some developments on simultaneous test procedures. Developments in 
statistics (P.R. Krishnaiah, Ed.), Vol. 2, 157-201. New York: Academic Press. 

Lewis, C., & Keren, G. (1977). You can’t have your cake and eat it too: Some considerations of the 
error term. Psychological Bulletin, 84, 1150-1154. 

Milliken, G.A., & Johnson, D.E. (1992). Analysis of messy data. New York: Chapman and Hall. 

O’Brien, R.G. (1976). Comments on “Some problems in the nonorthogonal analysis of variance.” 
Psychological Bulletin, 83, 72-74. 

Overall, J.E., & Spiegel, D.K. (1969). Concerning least squares analysis of experimental data. 
Psychological Bulletin, 72, 311-322. 

Overall, J.E., & Spiegel, D.K. (1973). Comment on “Regression analysis of proportional data.” 
Psychological Bulletin, 80, 28-30. 

Overall, J.E., Spiegel, D.K., & Cohen, J. (1975). Equivalence of orthogonal and nonorthogonal 
analysis of variance. Psychological Bulletin, 82, 182-186. 

Rawlings, R.R., Jr. (1972). Note on nonorthogonal analysis of variance. Psychological Bulletin, 77, 
373-374. 

Rawlings, R.R., Jr. (1973). Comments on the Overall and Spiegel paper. Psychological Bulletin, 79, 
168-169. 

Rosnow, R.L., & Rosenthal, R. (1989a). Definition and interpretation of interaction effects. 
Psychological Bulletin, 105, 143-146. 

Rosnow, R.L., & Rosenthal, R. (1989b). Statistical procedures and the justification of knowledge in 
psychological science. American Psychologist, 44, 1276-1284. 

SAS Institute (1990). SAS/STAT user’s guide (Version 6, 4th ed.). Cary, NC: Author. 

Scheffe, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika, 40, 
87-104. 

Scheffe, H. (1959). The analysis of variance. New York: Wiley. 

Searle, S.R. (1971). Linear models. New York: Wiley. 

Searle, S.R. (1993). Unbalanced data and cell means models. Applied analysis of variance in 
behavioral science (L.K. Edwards, Ed.), 375-420. New York: Marcel Dekker. 

SPSS (1990). SPSS reference guide. Chicago: Author. 

Timm, N.H., & Carlson, J.E. (1975). Analysis of variance through full rank models. Multivariate 
Behavioral Research Monograph, No. 75-1. 

Timm, N.H. (1995). Simultaneous inference using finite intersection tests: A better mousetrap. 
Multivariate Behavioral Research, 30(4), 461-512. 








TM028354 



U.S. DEPARTMENT OF EDUCATION 

Office of Educational Research and Improvement (OERI) 
Educational Resources Information Center (ERIC) 



REPRODUCTION RELEASE 

(Specific Document) 




I. DOCUMENT IDENTIFICATION: 

Title: Analyzing Interactions in Unbalanced Two-way Designs 

Author(s): Neil H. Timm 

Corporate Source: University of Pittsburgh 

Publication Date: April 6, 1998 


